Gradient Boost

Before moving forward with the to-do list, let’s throw gradient boosting at it.

For many reasons, Random Forest is usually a very good baseline model. In this particular case I started with polynomial OLS as the baseline model, simply because the correlations made it evident that the relationship between temperature and consumption follows a polynomial shape. But let’s go back to tree ensembles, this time with gradient boosting.

Writing pin:
Name: 'wd-gb'
Version: 20251122T162849Z-6a5a6
♻️  stepit 'gb_raw': is up-to-date. Using cached result for `strom.modelling.assess_model()` 2025-11-22 16:28:49

Metrics

                                        Single split          CV
Metric                                  train      test       test       train
MAE - Mean Absolute Error               1.409974   2.338275   2.303979   1.381000
MSE - Mean Squared Error                3.723677   21.094079  23.726213  3.632431
RMSE - Root Mean Squared Error          1.929683   4.592829   3.941234   1.902960
R2 - Coefficient of Determination       0.960355   0.765333   0.384288   0.962644
MAPE - Mean Absolute Percentage Error   0.137213   0.226350   0.171635   0.127905
EVS - Explained Variance Score          0.960355   0.772304   0.432298   0.962644
MeAE - Median Absolute Error            1.028508   1.192214   1.476596   0.995582
D2 - D2 Absolute Error Score            0.796192   0.655386   0.337103   0.804585
Pinball - Mean Pinball Loss             0.704987   1.169138   1.151989   0.690500

Scatter plot matrix

Observed vs. Predicted and Residuals vs. Predicted

Check the residuals to assess the goodness of fit:

  • Are they white noise, or is there a pattern?
  • Is there heteroscedasticity?
  • Is there non-linearity?
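
The checks above are usually done by eye on the plots, but two of them can be roughly quantified. The sketch below is an assumption-laden illustration, not part of the `strom` package: the function names and the synthetic residuals are hypothetical. It uses the lag-1 autocorrelation as a crude white-noise check and the correlation between absolute residuals and predictions as a crude heteroscedasticity check.

```python
import numpy as np


def lag1_autocorrelation(residuals):
    """Correlation between consecutive residuals; close to 0 for white noise."""
    r = np.asarray(residuals, dtype=float)
    return float(np.corrcoef(r[:-1], r[1:])[0, 1])


def spread_vs_prediction(residuals, predicted):
    """Correlation between |residual| and prediction.

    Values clearly away from 0 hint at heteroscedasticity.
    """
    return float(np.corrcoef(np.abs(residuals), predicted)[0, 1])


# Synthetic residuals, for illustration only
rng = np.random.default_rng(7)
predicted = np.linspace(5.0, 30.0, 500)
white = rng.normal(0.0, 1.0, size=500)           # well-behaved residuals
fanning = white * predicted / predicted.mean()   # spread grows with the prediction
trending = white + np.linspace(-3.0, 3.0, 500)   # slow drift over time
```

For the well-behaved residuals both numbers stay near zero, while the fanning and drifting series push one of them clearly away from zero. Neither number replaces looking at the plots.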

Normality of Residuals:

Check for …

  • Are residuals normally distributed?
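
As a numeric complement to the Q-Q plot, a Shapiro-Wilk test together with skewness and excess kurtosis gives a quick summary. This is only a sketch under assumptions: the helper `normality_report` and the synthetic residuals are hypothetical and are not taken from the project.

```python
import numpy as np
from scipy import stats


def normality_report(residuals):
    """Summarise how far a residual sample is from a normal distribution."""
    w_stat, p_value = stats.shapiro(residuals)
    return {
        "shapiro_W": float(w_stat),
        "p_value": float(p_value),  # small p-value: evidence against normality
        "skewness": float(stats.skew(residuals)),
        "excess_kurtosis": float(stats.kurtosis(residuals)),  # 0 for a normal
    }


# Synthetic residuals, for illustration only
rng = np.random.default_rng(7)
normal_resid = rng.normal(0.0, 2.0, size=500)
skewed_resid = rng.exponential(2.0, size=500) - 2.0  # right-skewed, mean near 0

normal_summary = normality_report(normal_resid)
skewed_summary = normality_report(skewed_resid)
```

With large samples the test flags even small deviations, so the skewness and kurtosis values are often more informative than the p-value alone.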

Leverage

Scale-Location plot

Residuals Autocorrelation Plot

Residuals vs Time

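Again, the model overfits a lot: R2 drops from 0.96 on the training data to 0.77 on the single-split test set and to 0.38 on the cross-validation test folds.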

Parameter: param_model__learning_rate

/home/runner/work/strom/strom/src/strom/strom.py:837: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['0.1' '0.1' '0.1' ... '0.1' '0.1' '0.1']' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
/home/runner/work/strom/strom/src/strom/strom.py:837: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['5' '5' '5' ... '5' '5' '5']' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
/home/runner/work/strom/strom/src/strom/strom.py:837: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['5' '5' '5' ... '5' '5' '5']' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
/home/runner/work/strom/strom/src/strom/strom.py:837: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['48' '48' '48' ... '48' '48' '48']' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
/home/runner/work/strom/strom/src/strom/strom.py:837: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['60' '60' '60' ... '60' '60' '60']' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
/home/runner/work/strom/strom/src/strom/strom.py:837: FutureWarning: Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value '['1' '1' '1' ... '1' '1' '1']' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
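
The warning says that string values are being written into columns that still have a numeric dtype. What line 837 of `strom.py` actually does is not shown here, so the following is only a sketch of the usual remedy, using a hypothetical results table: cast the parameter columns explicitly and replace them, instead of assigning strings into the existing numeric columns in place.

```python
import warnings

import pandas as pd

# Hypothetical stand-in for a tuning-results table
results = pd.DataFrame(
    {
        "param_model__learning_rate": [0.1, 0.1, 0.05],
        "param_model__max_depth": [5, 5, 3],
        "mean_test_score": [0.45, 0.44, 0.40],
    }
)
param_cols = [c for c in results.columns if c.startswith("param_")]

# An in-place assignment such as
#     results.loc[:, col] = results[col].astype(str)
# keeps the numeric column and tries to store strings in it, which is one
# pattern that produces this kind of FutureWarning. Replacing the columns
# with explicitly cast copies changes the dtype up front instead.
with warnings.catch_warnings():
    warnings.simplefilter("error", FutureWarning)  # fail loudly if it still warns
    results = results.astype({col: str for col in param_cols})
```

After the cast the parameter columns hold strings such as `"0.1"` and `"5"`, which is convenient when they are only used as categorical labels in plots.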

Parameter: param_model__max_depth

Parameter: param_model__min_samples_leaf

Parameter: param_model__min_samples_split

Parameter: param_model__n_estimators

Parameter: param_model__subsample

Parameter: param_vars__columns

Best model

{'model__learning_rate': 0.1,
 'model__max_depth': 5,
 'model__min_samples_leaf': 5,
 'model__min_samples_split': 48,
 'model__n_estimators': 60,
 'model__subsample': 1,
 'vars__columns': ['rf_tu_mean', 'vp_std_mean']}
♻️  stepit 'gb_tuned': is up-to-date. Using cached result for `strom.modelling.assess_model()` 2025-11-22 16:28:56
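
The parameter names above (`vars__columns`, `model__max_depth`, and so on) follow scikit-learn's `step__parameter` convention for tuning a `Pipeline`. The sketch below shows how a search over such a pipeline could produce a dictionary of this shape. Everything beyond the step and column names is an assumption: the `ColumnSelector` class is a hypothetical stand-in for the project's own, the data is synthetic, and the search strategy, splitter and scoring are illustrative choices rather than the ones used here.

```python
import numpy as np
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline


class ColumnSelector(BaseEstimator, TransformerMixin):
    """Hypothetical stand-in for the project's ColumnSelector: keep `columns`."""

    def __init__(self, columns=None):
        self.columns = columns

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X[self.columns]


# Synthetic data; only the column names are borrowed from the output above
rng = np.random.default_rng(7)
X = pd.DataFrame(
    rng.normal(size=(200, 3)),
    columns=["tt_tu_mean", "rf_tu_mean", "vp_std_mean"],
)
y = 20 - 3 * X["tt_tu_mean"] + X["tt_tu_mean"] ** 2 + rng.normal(scale=0.5, size=200)

pipe = Pipeline(
    [("vars", ColumnSelector()), ("model", GradientBoostingRegressor(random_state=7))]
)

# The feature subset is tuned like any other hyperparameter
param_grid = {
    "vars__columns": [["tt_tu_mean"], ["rf_tu_mean", "vp_std_mean"], list(X.columns)],
    "model__max_depth": [2, 5],
    "model__n_estimators": [30, 60],
}
search = GridSearchCV(
    pipe,
    param_grid,
    cv=TimeSeriesSplit(n_splits=3),  # assumed splitter for time-ordered data
    scoring="neg_mean_absolute_error",
)
search.fit(X, y)
best_params = search.best_params_
```

Treating the column subset as a grid entry is what allows a result like `'vars__columns': ['rf_tu_mean', 'vp_std_mean']` to appear next to the model hyperparameters.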

Metrics

                                        Single split          CV
Metric                                  train      test       test       train
MAE - Mean Absolute Error               1.563379   2.374218   2.208940   1.548676
MSE - Mean Squared Error                6.409281   21.100475  22.067249  6.006925
RMSE - Root Mean Squared Error          2.531656   4.593525   3.773892   2.440925
R2 - Coefficient of Determination       0.931762   0.765262   0.450495   0.938437
MAPE - Mean Absolute Percentage Error   0.141916   0.237917   0.170104   0.134164
EVS - Explained Variance Score          0.931762   0.771610   0.503243   0.938437
MeAE - Median Absolute Error            1.028037   1.231788   1.418443   1.030482
D2 - D2 Absolute Error Score            0.774018   0.650089   0.352238   0.780897
Pinball - Mean Pinball Loss             0.781690   1.187109   1.104470   0.774338

Scatter plot matrix

Observed vs. Predicted and Residuals vs. Predicted

Check the residuals to assess the goodness of fit:

  • Are they white noise, or is there a pattern?
  • Is there heteroscedasticity?
  • Is there non-linearity?

Normality of Residuals:

Check for …

  • Are residuals normally distributed?

Leverage

Scale-Location plot

Residuals Autocorrelation Plot

Residuals vs Time

Compare vanilla vs. tuned

Cross-validation messages

♻️  stepit 'cross_validate_pipe': is up-to-date. Using cached result for `strom.modelling.cross_validate_pipe()` 2025-11-22 16:29:00

♻️  stepit 'cross_validate_pipe': is up-to-date. Using cached result for `strom.modelling.cross_validate_pipe()` 2025-11-22 16:29:00
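
The internals of `strom.modelling.cross_validate_pipe()` are not shown, so the following sketch uses scikit-learn's `cross_validate` as a stand-in to compare the default and the tuned configuration. The data is synthetic and the `TimeSeriesSplit` splitter is an assumption. The hyperparameters of the tuned model are the ones reported under "Best model". The train/test gap in R2 serves as a simple overfitting indicator.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import TimeSeriesSplit, cross_validate

# Synthetic data, for illustration only
rng = np.random.default_rng(7)
X = rng.normal(size=(300, 2))
y = X[:, 0] ** 2 + rng.normal(scale=1.0, size=300)

models = {
    "vanilla": GradientBoostingRegressor(random_state=7),
    "tuned": GradientBoostingRegressor(
        max_depth=5,
        min_samples_leaf=5,
        min_samples_split=48,
        n_estimators=60,
        random_state=7,
    ),
}

gaps = {}
for name, model in models.items():
    cv = cross_validate(
        model,
        X,
        y,
        cv=TimeSeriesSplit(n_splits=5),  # assumed splitter
        scoring="r2",
        return_train_score=True,  # needed to compare train vs. test
    )
    # Mean train R2 minus mean test R2: larger means more overfitting
    gaps[name] = float(cv["train_score"].mean() - cv["test_score"].mean())
```

A smaller gap for one configuration suggests it generalises better, provided its test score does not drop at the same time.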

Metrics

Single split

Metrics based on the test set of the single split

Cross validation

Predictions, residuals, observed

Time vs. Predicted and Observed

Time vs. Residuals

Model details

Pipeline(steps=[('vars',
                 ColumnSelector(columns=['tt_tu_mean', 'rf_tu_mean', 'td_mean',
                                         'vp_std_mean', 'tf_std_mean'])),
                ('model', GradientBoostingRegressor(random_state=7))])
Pipeline(steps=[('vars', ColumnSelector(columns=['rf_tu_mean', 'vp_std_mean'])),
                ('model',
                 GradientBoostingRegressor(max_depth=5, min_samples_leaf=5,
                                           min_samples_split=48,
                                           n_estimators=60, random_state=7,
                                           subsample=1))])

TODOs